Bounds for Resequencing By Hybridization
Abstract
We study the problem of finding the sequence of an unknown DNA fragment given the set of its k-long subsequences and a homologous sequence, namely a sequence that is similar to the target sequence. Such a sequence is available in some applications, e.g., when detecting single nucleotide polymorphisms. Pe’er and Shamir studied this problem and presented a heuristic algorithm for it. In this paper, we give an algorithm with provable performance: we show that, under some assumptions, the algorithm can reconstruct a random sequence of length O(4^k) with high probability. We also show that no algorithm can reconstruct sequences of length Ω(log k · 4^k).
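The input described in the abstract can be illustrated with a minimal sketch. It reads "k-long subsequences" as contiguous length-k substrings, which is the usual meaning in sequencing by hybridization but is an assumption here. The function names `spectrum` and `hamming`, and the example sequences, are hypothetical and not taken from the paper. The sketch only builds the input and shows why the set alone can be ambiguous; it does not implement the paper's reconstruction algorithm.

```python
def spectrum(seq: str, k: int) -> set[str]:
    """Return the set of all contiguous length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical target and a homologous reference differing by one substitution.
target = "ACGTACGGA"
reference = "ACGTACGTA"

target_spectrum = spectrum(target, 4)
distance = hamming(target, reference)

# Two different sequences can share a spectrum, so the set of k-mers
# alone does not always determine the sequence.
ambiguous = spectrum("ACAC", 2) == spectrum("CACA", 2)
```

In this toy case the two sequences `ACAC` and `CACA` have the same 2-mer set, which hints at why extra information such as a homologous sequence can help disambiguate a reconstruction.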
Similar resources
Targeted Resequencing Reveals ALK Fusions in Non-Small Cell Lung Carcinomas Detected by FISH, Immunohistochemistry, and Real-Time RT-PCR: A Comparison of Four Methods
Anaplastic lymphoma receptor tyrosine kinase (ALK) gene rearrangements occur in a subgroup of non-small cell lung carcinomas (NSCLCs). The identification of these rearrangements is important for guiding treatment decisions. The aim of our study was to screen ALK gene fusions in NSCLCs and to compare the results detected by targeted resequencing with results detected by commonly used methods, in...
Resequencing Data of 20 Arabidopsis Ecotypes
This diploma thesis describes work on a chip resequencing project of 20 ecotypes of the plant model species Arabidopsis thaliana; these ecotypes are accessions from natural populations. Chip resequencing primarily aims at identifying single nucleotide polymorphisms (SNPs), the most abundant class of naturally occurring sequence variation. For resequencing, DNA microarrays are empl...
ThermoAlign: a genome-aware primer design tool for tiled amplicon resequencing
Isolating and sequencing specific regions in a genome is a cornerstone of molecular biology. This has been facilitated by computationally encoding the thermodynamics of DNA hybridization for automated design of hybridization and priming oligonucleotides. However, the repetitive composition of genomes challenges the identification of target-specific oligonucleotides, which limits genetics and ge...
A bioinformatic filter for improved base-call accuracy and polymorphism detection using the Affymetrix GeneChip® whole-genome resequencing platform
DNA resequencing arrays enable rapid acquisition of high-quality sequence data. This technology represents a promising platform for rapid high-resolution genotyping of microorganisms. Traditional array-based resequencing methods have relied on the use of specific PCR-amplified fragments from the query samples as hybridization targets. While this specificity in the target DNA population reduces ...
Comparisons of substitution, insertion and deletion probes for resequencing and mutational analysis using oligonucleotide microarrays
Although oligonucleotide probes complementary to single nucleotide substitutions are commonly used in microarray-based screens for genetic variation, little is known about the hybridization properties of probes complementary to small insertions and deletions. It is necessary to define the hybridization properties of these latter probes in order to improve the specificity and sensitivity of olig...